Method and system for reducing contexts for context based compression systems

ABSTRACT

For context based compression techniques, for example Context Based YK compression, a method and system for grouping contexts from a given context model together to create a new context model that has fewer contexts, but retains acceptable compression gains compared to the original context model. According to an exemplary embodiment of the method, empirical statistics are determined for a file type of a file to be compressed, and the context model is generated by iteratively grouping contexts of an initial context model in accordance with the empirical statistics, the context model having fewer contexts than the initial context model.

CROSS REFERENCE TO RELATED APPLICATION

This is a continuation of U.S. application Ser. No. 12/566,815, filed Sep. 25, 2009, which is a continuation of U.S. application Ser. No. 12/040,149, filed Feb. 29, 2008, now issued as U.S. Pat. No. 7,616,132, which claims the benefit of U.S. Application No. 60/950,712, filed on Jul. 19, 2007, the contents of all of which are incorporated herein by reference.

COPYRIGHT

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by any one of the patent document or patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyrights whatsoever.

FIELD OF THE INVENTION

The present invention relates generally to context models. More particularly, the present invention relates to context based compression techniques.

BACKGROUND OF THE INVENTION

In “Efficient Universal Lossless Data Compression Algorithms Based on a Greedy Sequential Grammar Transform—Part One: Without Context Models”, E.-h. Yang and J. C. Kieffer, IEEE Transactions on Information Theory, VOL. 46, NO. 3, May 2000, pp. 755-777, and “Grammar based codes: A new class of universal lossless source codes,” J. C. Kieffer and E.-h. Yang, IEEE Transactions on Information Theory, VOL. 46, pp. 737-754, May 2000, a compression algorithm which uses a grammar transform to construct a sequence of irreducible context free grammars to compress a data sequence is described. The entire contents of both are hereby incorporated by reference. This algorithm has been called the YK compression algorithm in the art, and will be so referred to herein. The YK compression algorithm describes a set of reduction rules for producing an irreducible grammar for encoding an original data sequence. This grammar can then be used to recover the original data sequence.

In many instances, such as compression of web pages, Java applets, or text files, there is often some a priori knowledge about the data sequences being compressed. This knowledge can often take the form of so-called “context models.” Accordingly, context based compression techniques are particularly efficient for encoding web pages in which the content of a web page changes often, while the underlying structure of the web page remains approximately constant. The relative consistency of the underlying structure provides the predictable context for the data as it is compressed.

U.S. Pat. No. 6,801,141, issued on Oct. 5, 2004 to En-Hui Yang and Da-Ke He, and “Efficient Universal Lossless Data Compression Algorithms Based on a Greedy Sequential Grammar Transform—Part Two: With Context Models”, En-Hui Yang and Da-Ke He, IEEE Transactions on Information Theory, VOL. 49, NO. 11, November 2003, pp. 2874-2894, both describe an improvement to the YK compression algorithm by using contexts, and both are hereby incorporated by reference in their entirety, as are the references cited therein. We will refer to the methods and techniques described therein as context based YK compression (CBYK).

One aspect of the CBYK described therein relates to a method of sequentially transforming an original data sequence associated with a known context model into an irreducible context-dependent grammar, and recovering the original data sequence from the grammar. The method includes the steps of parsing a substring from the sequence, generating an admissible context-dependent grammar based on the parsed substring, applying a set of reduction rules to the admissible context-dependent grammar to generate a new irreducible context-dependent grammar, and repeating these steps until the entire sequence is encoded. In addition, a set of reduction rules based on pairs of variables and contexts represents the irreducible context-dependent grammar such that the pairs represent non-overlapping repeated patterns and contexts of the data sequence.

CBYK compression can provide significant compression gains over the context-free YK compression algorithm, especially when it is combined with interactive compression. In brief, context based YK compression uses the context as a form of predictor of the next parsed symbol or phrase and the corresponding estimated conditional probability for coding, in order to achieve good compression. In theory, the better the context model used by the CBYK, the more likely the compression rate will be optimized.

In general, for CBYK compression, a good context model acts as a good form of predictor of the next parsed symbol or phrase. In this regard, improvements to the context model can increase the effectiveness of the compression. However, if improving the context model increases the size of the context model, practical limits need to be considered.

It has been found that the memory requirements used to process the CBYK increase significantly with the size of the context model. For example, if a context model is not chosen properly, the number of grammar variables can be significantly higher than the number in context-free YK, resulting in higher memory usage. If the memory usage of CBYK exceeds the constraints or the available capacity, then the use of CBYK is not desirable regardless of how significant the increase in compression gain is. Depending on the application and the devices running the CBYK, even a simple context model, such as using the last byte of the previous parsed phrase as the context, can exceed memory constraints. As the context length grows, the number of contexts grows exponentially. On the other hand, since CBYK uses in general more resources than context-free YK, it would not be preferable without a benefit in compression gain.

Therefore, it is desirable to provide a method for creating a context model, for example, for use with CBYK, that provides a suitable trade-off between memory requirements and compression gain. In particular, it is desirable to provide a context model that uses less memory, but still retains acceptable compression gains, compared to larger context models.

BRIEF DESCRIPTION OF THE DRAWINGS

Aspects and features of the present invention will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying figures. For a better understanding of the various embodiments described herein and to show more clearly how they may be carried into effect, reference will now be made, by way of example only, to the accompanying drawings which show at least one exemplary embodiment and in which:

FIG. 1 is a block diagram of an exemplary embodiment of a mobile device;

FIG. 2 is a block diagram of an exemplary embodiment of a communication subsystem component of the mobile device of FIG. 1;

FIG. 3 is an exemplary block diagram of a node of a wireless network;

FIG. 4 is a block diagram illustrating components of a host system in one exemplary configuration for use with the wireless network of FIG. 3 and the mobile device of FIG. 1;

FIG. 5 is a flow chart illustrating method steps according to one embodiment;

FIG. 6 is a flow chart illustrating method steps according to another embodiment;

FIG. 7 is a block diagram which illustrates an apparatus for performing CBYK compression using a reduced context set, according to one embodiment; and

FIG. 8 is a flowchart illustrating how to determine the reduced context to be used in each iteration of a CBYK compression, according to an embodiment of the invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

In the following description, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that these specific details are not required in order to practice the present invention. It will be appreciated that for simplicity and clarity of illustration, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the embodiments described herein. However, it will be understood by those of ordinary skill in the art that the embodiments described herein may be practiced without these specific details. In other instances, well-known methods, procedures and components have not been described in detail so as not to obscure the embodiments described herein. In other instances, well-known electrical structures and circuits are not shown in block diagram form in order not to obscure the present invention. For example, specific details are not provided as to whether the embodiments of the invention described herein are implemented as a software routine, hardware circuit, firmware, or a combination thereof. Also, the description is not to be considered as limiting the scope of the embodiments described herein.

Embodiments of the invention may be represented as a software product stored in a machine-readable medium (also referred to as a computer-readable medium, a processor-readable medium, or a computer usable medium having a computer-readable program code embodied therein). The machine-readable medium may be any suitable tangible medium, including magnetic, optical, or electrical storage medium including a diskette, compact disk read only memory (CD-ROM), memory device (volatile or non-volatile), or similar storage mechanism. The machine-readable medium may contain various sets of instructions, code sequences, configuration information, or other data, which, when executed, cause a processor to perform steps in a method according to an embodiment of the invention. Those of ordinary skill in the art will appreciate that other instructions and operations necessary to implement the described invention may also be stored on the machine-readable medium. Software running from the machine-readable medium may interface with circuitry to perform the described tasks.

We describe herein methods and systems for providing a context model, for example for CBYK. Preferred embodiments will be described with reference to a specific example of a CBYK compression process, to which the discussed embodiments are well suited. However, it should be noted that the invention is not so limited, and is more generally applicable for other context based compression techniques, or other systems which use such a context model. A context is the a priori information (based on previously processed data) that is used to compress the next parsed phrase. A context model defines the set of possible contexts, known as the context set, and the method for generating the next context. The context sequence is the sequence of contexts that is generated from a specific parsing of a data sequence. A couple of examples are presented to illustrate these terms:

Example 1

Consider the example that the binary sequence X=101110101000111 is to be compressed. Assume that the sequence will be parsed one bit at a time, and the context model is the last two bits, i.e. the context set={00,01,10,11} and the next context is c_(i+1)=b_(i−1)b_(i). Let the initial context be 00. Denote the bit b_(i) that is parsed and the context c_(i) for that bit by b_(i)|c_(i). The sequence of parsed symbols is 1|00, 0|01, 1|10, 1|01, 1|11, 0|11, 1|10, 0|01, 1|10, 0|01, 0|10, 0|00, 1|00, 1|01, 1|11. The context sequence is 00, 01, 10, 01, 11, 11, 10, 01, 10, 01, 10, 00, 00, 01, 11.
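By way of illustration only, Example 1 can be traced with a short Python sketch. The function name `fixed_order_contexts` and its parameters are hypothetical and do not appear in the disclosure; the sketch assumes the context is simply the last two parsed bits, with the bits preceding the sequence taken from the initial context 00.

```python
def fixed_order_contexts(bits, order=2, initial="00"):
    """Trace a context model whose context is the last `order` parsed bits.

    Returns (parsed, contexts), where parsed[i] is written "b_i|c_i".
    Hypothetical helper for illustration; not part of the disclosure.
    """
    context = initial
    parsed, contexts = [], []
    for b in bits:
        contexts.append(context)
        parsed.append(f"{b}|{context}")
        # Slide the window: drop the oldest bit, append the bit just parsed.
        context = (context + b)[-order:]
    return parsed, contexts


X = "101110101000111"
parsed, contexts = fixed_order_contexts(X)
print(", ".join(parsed))    # parsed symbols in b_i|c_i notation
print(", ".join(contexts))  # context sequence of Example 1
```

The second printed line reproduces the context sequence listed in Example 1.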

Example 2

Consider another example where the same binary sequence as above is to be compressed. However, the context model is defined by the context set {00, 10, 1} and the next context is defined by:

$c_{i+1} = \begin{cases} 1 & \text{if } b_{i} = 1 \\ b_{i-1}b_{i} & \text{otherwise.} \end{cases}$

The sequence of parsed symbols is 1|00, 0|1, 1|10, 1|1, 1|1, 0|1, 1|10, 0|1, 1|10, 0|1, 0|10, 0|00, 1|00, 1|1, 1|1. The context sequence is 00, 1, 10, 1, 1, 1, 10, 1, 10, 1, 10, 00, 00, 1, 1.
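Example 2 can be traced in the same way. The following Python sketch is illustrative only; the names `next_context` and `variable_contexts` are hypothetical, and the bit preceding the sequence is assumed to be the last bit of the initial context 00.

```python
def next_context(prev_bit, bit):
    """Next-context rule of Example 2: "1" if the parsed bit is 1,
    otherwise the previous bit followed by the parsed bit."""
    return "1" if bit == "1" else prev_bit + bit


def variable_contexts(bits, initial="00"):
    """Return the context sequence for the context set {00, 10, 1}.

    Hypothetical helper for illustration; not part of the disclosure.
    """
    context = initial
    prev_bit = initial[-1]  # assumed bit preceding the sequence
    contexts = []
    for b in bits:
        contexts.append(context)
        context = next_context(prev_bit, b)
        prev_bit = b
    return contexts


X = "101110101000111"
contexts2 = variable_contexts(X)
print(", ".join(contexts2))  # context sequence of Example 2
```

Only three distinct contexts occur, compared with four in Example 1, which is the kind of reduction in the size of the context set that the described embodiments aim for.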

One aspect of the invention provides a method of generating a context model to be used for context dependent based compression comprising: (a) determining file categorization criteria for a file to be compressed; (b) determining an initial context model including an initial value for a current number of contexts and an initial context set based on the file categorization criteria; (c) determining, for the initial context model, empirical statistics of contexts and symbols; (d) using the empirical statistics to reduce the current number of contexts to a desired number of contexts; and (e) generating the context model for mapping data elements to a set of contexts of size equal to said desired number of contexts.
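As a minimal sketch of step (c), and assuming that the empirical statistics are simply counts of how often each symbol is parsed in each context, the statistics for the sequence and context model of Example 1 could be gathered as follows. The function name `empirical_statistics` and the count-based representation are assumptions made for illustration; the disclosure does not prescribe a particular data structure.

```python
from collections import Counter, defaultdict


def empirical_statistics(symbols, contexts):
    """Count occurrences of each symbol within each context.

    Returns {context: {symbol: count}}. Hypothetical helper; the
    count-based representation is an assumption for illustration.
    """
    stats = defaultdict(Counter)
    for symbol, context in zip(symbols, contexts):
        stats[context][symbol] += 1
    return {c: dict(counts) for c, counts in stats.items()}


# Parsed bits and context sequence taken from Example 1.
bits = "101110101000111"
contexts = ["00", "01", "10", "01", "11", "11", "10", "01",
            "10", "01", "10", "00", "00", "01", "11"]

stats = empirical_statistics(bits, contexts)
for c in sorted(stats):
    print(c, stats[c])
```

In practice such counts would be accumulated over many representative files of a given category rather than over a single short sequence.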

Another aspect of the invention provides a method of developing a context model for use in a context-based data compression process comprising: (a) determining an initial context model including an initial set of contexts containing a first number of contexts for each of at least one data file; and (b) forming a reduced set of contexts with a smaller number of contexts for each of said at least one data file by combining contexts in such a manner as to reduce the memory requirements necessary for implementing said context-based data compression process while still achieving a satisfactory compression rate. According to one embodiment consistent with this aspect, step (a) comprises determining an initial set of contexts and empirical statistics for each of a plurality of categories of files, and step (b) comprises applying a grouping function to each initial set of contexts to combine the contexts into a smaller set of contexts for each file type based on said empirical statistics for each file type. According to one example, the step of applying a grouping function comprises iteratively grouping a pair of contexts together to form a grouped context, wherein each grouped context represents a local minimum based on said empirical statistics.
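The iterative pairwise grouping can be sketched as a greedy procedure. The sketch below is an assumption-laden illustration, not the claimed method: it assumes that the quantity being locally minimized at each iteration is the increase in empirical code length (in bits) caused by merging two groups of contexts, and the names `cost_bits` and `group_contexts` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cost_bits(counts):
    """Empirical code length in bits of the symbols seen in one group:
    -sum(n * log2(n / N)). Assumed cost measure, for illustration only."""
    total = sum(counts.values())
    return -sum(n * math.log2(n / total) for n in counts.values() if n)


def group_contexts(stats, target):
    """Greedily merge the pair of groups whose merge adds the fewest bits,
    until `target` groups remain.

    stats: {context: {symbol: count}}.
    Returns (mapping from original context to group id, grouped counts).
    """
    groups = {(c,): Counter(cnt) for c, cnt in stats.items()}
    while len(groups) > target:
        best = None
        for a, b in combinations(sorted(groups), 2):
            merged = groups[a] + groups[b]
            delta = cost_bits(merged) - cost_bits(groups[a]) - cost_bits(groups[b])
            if best is None or delta < best[0]:
                best = (delta, a, b, merged)
        _, a, b, merged = best
        del groups[a], groups[b]
        groups[tuple(sorted(a + b))] = merged
    mapping = {c: gid for gid, members in enumerate(sorted(groups)) for c in members}
    return mapping, groups


# Counts of each bit per context for the sequence of Example 1.
stats = {
    "00": {"1": 2, "0": 1},
    "01": {"0": 3, "1": 2},
    "10": {"1": 3, "0": 1},
    "11": {"1": 2, "0": 1},
}
mapping, groups = group_contexts(stats, target=3)
print(mapping)
```

With these counts, contexts 00 and 11 have identical empirical distributions, so merging them costs nothing under the assumed measure and they are grouped first. Other cost measures could be substituted in `cost_bits` without changing the overall loop.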

Another aspect of the invention provides a system for compressing an input string x=x₁x₂ . . . x_(n) according to CBYK, said input string having an associated context set ε={c¹, c², . . . , c^(j)}, said system comprising: (a) a parser for parsing said input string to produce a substring; (b) a context generator coupled to a first output of said parser, said context generator accessing a mapping file M defining a grouping function g reducing said context set ε to a reduced set of contexts ε̂={ĉ¹, ĉ², . . . , ĉ^(k)} with k<j, and said context generator using said context set ε={c¹, c², . . . , c^(j)} and said mapping file M to determine the next context, which is supplied to said parser, wherein the next context ĉ_(i+1) is determined from ĉ_(i+1)=g(c_(i+1)), wherein c_(i+1) is determined from x₁ after x₁ is parsed; (c) a context dependent grammar updating device coupled to a second output of said parser and also coupled to the output of said context generator for producing an updated context-dependent grammar G; and (d) a context dependent grammar coder coupled to the output of said context dependent grammar updating device for producing a compressed binary code word from which the original input string can be recovered.
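The role of the context generator in element (b) can be illustrated with a short Python sketch. It assumes the two-bit context model of Example 1 as the original context set and represents the mapping file M as an in-memory dictionary; the group labels A, B and C, the dictionary `M`, and the function `reduced_context_sequence` are all hypothetical.

```python
def reduced_context_sequence(bits, mapping, order=2, initial="00"):
    """Yield the reduced context g(c_i) for each parsed bit.

    The original context c_i is the last `order` parsed bits; `mapping`
    plays the role of the grouping function g. Illustrative only.
    """
    context = initial
    reduced = []
    for b in bits:
        reduced.append(mapping[context])   # reduced context = g(original context)
        context = (context + b)[-order:]   # original next context
    return reduced


# Hypothetical mapping: four original contexts grouped into three.
M = {"00": "A", "11": "A", "01": "B", "10": "C"}

X = "101110101000111"
reduced = reduced_context_sequence(X, M)
print(", ".join(reduced))
```

The original context is still tracked internally so that the next original context can be derived, while only the reduced context is supplied to the downstream components; the grammar therefore only needs to distinguish k rather than j contexts.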

Another aspect of the invention provides a computer program product stored in a machine readable medium comprising instructions which, when executed by a processor of a device, cause said device to carry out the methods described herein.

We will discuss the methods and systems with reference to a particular exemplary application for which the embodiments of the invention are well suited, namely a wireless network where files are compressed for transmission to a mobile wireless communication device, hereafter referred to as a mobile device. Examples of applicable communication devices include pagers, cellular phones, cellular smart-phones, wireless organizers, personal digital assistants, computers, laptops, handheld wireless communication devices, wirelessly enabled notebook computers and the like.

The mobile device is a two-way communication device with advanced data communication capabilities including the capability to communicate with other mobile devices or computer systems through a network of transceiver stations. The mobile device may also have the capability to allow voice communication. Depending on the functionality provided by the mobile device, it may be referred to as a data messaging device, a two-way pager, a cellular telephone with data messaging capabilities, a wireless Internet appliance, or a data communication device (with or without telephony capabilities). To aid the reader in understanding the structure of the mobile device and how it communicates with other devices and host systems, reference will now be made to FIGS. 1 through 4.

Referring first to FIG. 1, shown therein is a block diagram of an exemplary embodiment of a mobile device 100. The mobile device 100 includes a number of components such as a main processor 102 that controls the overall operation of the mobile device 100. Communication functions, including data and voice communications, are performed through a communication subsystem 104. Data received by the mobile device 100 can be decompressed and decrypted by decoder 103, operating according to any suitable decompression techniques (e.g. YK decompression, and other known techniques) and encryption techniques (e.g. using an encryption technique such as Data Encryption Standard (DES), Triple DES, or Advanced Encryption Standard (AES)). The communication subsystem 104 receives messages from and sends messages to a wireless network 200. In this exemplary embodiment of the mobile device 100, the communication subsystem 104 is configured in accordance with the Global System for Mobile Communication (GSM) and General Packet Radio Services (GPRS) standards. The GSM/GPRS wireless network is used worldwide and it is expected that these standards will be superseded eventually by Enhanced Data GSM Environment (EDGE) and Universal Mobile Telecommunications Service (UMTS). New standards are still being defined, but it is believed that they will have similarities to the network behavior described herein, and it will also be understood by persons skilled in the art that the embodiments described herein are intended to use any other suitable standards that are developed in the future. The wireless link connecting the communication subsystem 104 with the wireless network 200 represents one or more different Radio Frequency (RF) channels, operating according to defined protocols specified for GSM/GPRS communications. With newer network protocols, these channels are capable of supporting both circuit switched voice communications and packet switched data communications.

Although the wireless network 200 associated with mobile device 100 is a GSM/GPRS wireless network in one exemplary implementation, other wireless networks may also be associated with the mobile device 100 in variant implementations. The different types of wireless networks that may be employed include, for example, data-centric wireless networks, voice-centric wireless networks, and dual-mode networks that can support both voice and data communications over the same physical base stations. Combined dual-mode networks include, but are not limited to, Code Division Multiple Access (CDMA) or CDMA2000 networks, GSM/GPRS networks (as mentioned above), and future third-generation (3G) networks like EDGE and UMTS. Some other examples of data-centric networks include WiFi 802.11, Mobitex™ and DataTAC™ network communication systems. Examples of other voice-centric data networks include Personal Communication Systems (PCS) networks like GSM and Time Division Multiple Access (TDMA) systems.

The main processor 102 also interacts with additional subsystems such as a Random Access Memory (RAM) 106, a flash memory 108, a display 110, an auxiliary input/output (I/O) subsystem 112, a data port 114, a keyboard 116, a speaker 118, a microphone 120, short-range communications 122 and other device subsystems 124.

Some of the subsystems of the mobile device 100 perform communication-related functions, whereas other subsystems may provide “resident” or on-device functions. By way of example, the display 110 and the keyboard 116 may be used for both communication-related functions, such as entering a text message for transmission over the network 200, and device-resident functions such as a calculator or task list.

The mobile device 100 can send and receive communication signals over the wireless network 200 after required network registration or activation procedures have been completed. Network access is associated with a subscriber or user of the mobile device 100. To identify a subscriber, the mobile device 100 requires a SIM/RUIM card 126 (i.e. Subscriber Identity Module or a Removable User Identity Module) to be inserted into a SIM/RUIM interface 128 in order to communicate with a network. The SIM card or RUIM 126 is one type of a conventional “smart card” that can be used to identify a subscriber of the mobile device 100 and to personalize the mobile device 100, among other things. Without the SIM card 126, the mobile device 100 is not fully operational for communication with the wireless network 200. By inserting the SIM card/RUIM 126 into the SIM/RUIM interface 128, a subscriber can access all subscribed services. Services may include: web browsing and messaging such as e-mail, voice mail, Short Message Service (SMS), and Multimedia Messaging Services (MMS). More advanced services may include: point of sale, field service and sales force automation. The SIM card/RUIM 126 includes a processor and memory for storing information. Once the SIM card/RUIM 126 is inserted into the SIM/RUIM interface 128, it is coupled to the main processor 102. In order to identify the subscriber, the SIM card/RUIM 126 can include some user parameters such as an International Mobile Subscriber Identity (IMSI). An advantage of using the SIM card/RUIM 126 is that a subscriber is not necessarily bound by any single physical mobile device. The SIM card/RUIM 126 may store additional subscriber information for a mobile device as well, including datebook (or calendar) information and recent call information. Alternatively, user identification information can also be programmed into the flash memory 108.

The mobile device 100 is a battery-powered device and includes a battery interface 132 for receiving one or more rechargeable batteries 130. In at least some embodiments, the battery 130 can be a smart battery with an embedded microprocessor. The battery interface 132 is coupled to a regulator (not shown), which assists the battery 130 in providing power V+ to the mobile device 100. Although current technology makes use of a battery, future technologies such as micro fuel cells may provide the power to the mobile device 100.

The mobile device 100 also includes an operating system 134 and software components 136 to 146 which are described in more detail below. The operating system 134 and the software components 136 to 146 that are executed by the main processor 102 are typically stored in a persistent store such as the flash memory 108, which may alternatively be a read-only memory (ROM) or similar storage element (not shown). Those skilled in the art will appreciate that portions of the operating system 134 and the software components 136 to 146, such as specific device applications, or parts thereof, may be temporarily loaded into a volatile store such as the RAM 106. Other software components can also be included, as is well known to those skilled in the art.

The subset of software applications 136 that control basic device operations, including data and voice communication applications, will normally be installed on the mobile device 100 during its manufacture. Other software applications include a message application 138 that can be any suitable software program that allows a user of the mobile device 100 to send and receive electronic messages. Various alternatives exist for the message application 138 as is well known to those skilled in the art. Messages that have been sent or received by the user are typically stored in the flash memory 108 of the mobile device 100 or some other suitable storage element in the mobile device 100. In at least some embodiments, some of the sent and received messages may be stored remotely from the device 100 such as in a data store of an associated host system that the mobile device 100 communicates with.

The software applications can further include a device state module 140, a Personal Information Manager (PIM) 142, and other suitable modules (not shown). The device state module 140 provides persistence, i.e. the device state module 140 ensures that important device data is stored in persistent memory, such as the flash memory 108, so that the data is not lost when the mobile device 100 is turned off or loses power.

The PIM 142 includes functionality for organizing and managing data items of interest to the user, such as, but not limited to, e-mail, contacts, calendar events, voice mails, appointments, and task items. A PIM application has the ability to send and receive data items via the wireless network 200. PIM data items may be seamlessly integrated, synchronized, and updated via the wireless network 200 with the mobile device subscriber's corresponding data items stored and/or associated with a host computer system. This functionality creates a mirrored host computer on the mobile device 100 with respect to such items. This can be particularly advantageous when the host computer system is the mobile device subscriber's office computer system.

The mobile device 100 also includes a connect module 144, and an information technology (IT) policy module 146. The connect module 144 implements the communication protocols that are required for the mobile device 100 to communicate with the wireless infrastructure and any host system, such as an enterprise system, that the mobile device 100 is authorized to interface with. Examples of a wireless infrastructure and an enterprise system are given in FIGS. 3 and 4, which are described in more detail below.

The connect module 144 includes a set of APIs that can be integrated with the mobile device 100 to allow the mobile device 100 to use any number of services associated with the enterprise system. The connect module 144 allows the mobile device 100 to establish an end-to-end secure, authenticated communication pipe with the host system. A subset of applications for which access is provided by the connect module 144 can be used to pass IT policy commands from the host system to the mobile device 100. This can be done in a wireless or wired manner. These instructions can then be passed to the IT policy module 146 to modify the configuration of the device 100. Alternatively, in some cases, the IT policy update can also be done over a wired connection.

Other types of software applications can also be installed on the mobile device 100. These software applications can be third party applications, which are added after the manufacture of the mobile device 100. Examples of third party applications include games, calculators, utilities, etc.

The additional applications can be loaded onto the mobile device 100 through at least one of the wireless network 200, the auxiliary I/O subsystem 112, the data port 114, the short-range communications subsystem 122, or any other suitable device subsystem 124. This flexibility in application installation increases the functionality of the mobile device 100 and may provide enhanced on-device functions, communication-related functions, or both. For example, secure communication applications may enable electronic commerce functions and other such financial transactions to be performed using the mobile device 100.

The data port 114 enables a subscriber to set preferences through an external device or software application and extends the capabilities of the mobile device 100 by providing for information or software downloads to the mobile device 100 other than through a wireless communication network. The alternate download path may, for example, be used to load an encryption key onto the mobile device 100 through a direct and thus reliable and trusted connection to provide secure device communication.

The data port 114 can be any suitable port that enables data communication between the mobile device 100 and another computing device. The data port 114 can be a serial or a parallel port. In some instances, the data port 114 can be a USB port that includes data lines for data transfer and a supply line that can provide a charging current to charge the battery 130 of the mobile device 100.

The short-range communications subsystem 122 provides for communication between the mobile device 100 and different systems or devices, without the use of the wireless network 200. For example, the subsystem 122 may include an infrared device and associated circuits and components for short-range communication. Examples of short-range communication standards include standards developed by the Infrared Data Association (IrDA), Bluetooth, and the 802.11 family of standards developed by IEEE.

In use, a received signal such as a text message, an e-mail message, or web page download will be processed by the communication subsystem 104 and input to the main processor 102. The main processor 102 will then process the received signal for output to the display 110 or alternatively to the auxiliary I/O subsystem 112. A subscriber may also compose data items, such as e-mail messages, for example, using the keyboard 116 in conjunction with the display 110 and possibly the auxiliary I/O subsystem 112. The auxiliary subsystem 112 may include devices such as: a touch screen, mouse, track ball, infrared fingerprint detector, or a roller wheel with dynamic button pressing capability. The keyboard 116 is preferably an alphanumeric keyboard and/or telephone-type keypad. However, other types of keyboards may also be used. A composed item may be transmitted over the wireless network 200 through the communication subsystem 104.

For voice communications, the overall operation of the mobile device 100 is substantially similar, except that the received signals are output to the speaker 118, and signals for transmission are generated by the microphone 120. Alternative voice or audio I/O subsystems, such as a voice message recording subsystem, can also be implemented on the mobile device 100. Although voice or audio signal output is accomplished primarily through the speaker 118, the display 110 can also be used to provide additional information such as the identity of a calling party, duration of a voice call, or other voice call related information.

Referring now to FIG. 2, an exemplary block diagram of the communication subsystem component 104 is shown. The communication subsystem 104 includes a receiver 150, a transmitter 152, as well as associated components such as one or more embedded or internal antenna elements 154 and 156, Local Oscillators (LOs) 158, and a processing module such as a Digital Signal Processor (DSP) 160. The particular design of the communication subsystem 104 is dependent upon the communication network 200 with which the mobile device 100 is intended to operate. Thus, it should be understood that the design illustrated in FIG. 2 serves only as one example.

Signals received by the antenna 154 through the wireless network 200 are input to the receiver 150, which may perform such common receiver functions as signal amplification, frequency down conversion, filtering, channel selection, and analog-to-digital (A/D) conversion. A/D conversion of a received signal allows more complex communication functions such as demodulation and decoding to be performed in the DSP 160. In a similar manner, signals to be transmitted are processed, including modulation and encoding, by the DSP 160. These DSP-processed signals are input to the transmitter 152 for digital-to-analog (D/A) conversion, frequency up conversion, filtering, amplification and transmission over the wireless network 200 via the antenna 156. The DSP 160 not only processes communication signals, but also provides for receiver and transmitter control. For example, the gains applied to communication signals in the receiver 150 and the transmitter 152 may be adaptively controlled through automatic gain control algorithms implemented in the DSP 160.

The wireless link between the mobile device 100 and the wireless network 200 can contain one or more different channels, typically different RF channels, and associated protocols used between the mobile device 100 and the wireless network 200. An RF channel is a limited resource that must be conserved, typically due to limits in overall bandwidth and limited battery power of the mobile device 100.

When the mobile device 100 is fully operational, the transmitter 152 is typically keyed or turned on only when it is transmitting to the wireless network 200 and is otherwise turned off to conserve resources. Similarly, the receiver 150 is periodically turned off to conserve power until it is needed to receive signals or information (if at all) during designated time periods.

Referring now to FIG. 3, a block diagram of an exemplary implementation of a node 202 of the wireless network 200 is shown. In practice, the wireless network 200 comprises one or more nodes 202. In conjunction with the connect module 144, the mobile device 100 can communicate with the node 202 within the wireless network 200. In the exemplary implementation of FIG. 3, the node 202 is configured in accordance with General Packet Radio Service (GPRS) and Global Systems for Mobile (GSM) technologies. The node 202 includes a base station controller (BSC) 204 with an associated tower station 206, a Packet Control Unit (PCU) 208 added for GPRS support in GSM, a Mobile Switching Center (MSC) 210, a Home Location Register (HLR) 212, a Visitor Location Registry (VLR) 214, a Serving GPRS Support Node (SGSN) 216, a Gateway GPRS Support Node (GGSN) 218, and a Dynamic Host Configuration Protocol (DHCP) server 220. This list of components is not meant to be an exhaustive list of the components of every node 202 within a GSM/GPRS network, but rather a list of components that are commonly used in communications through the network 200.

In a GSM network, the MSC 210 is coupled to the BSC 204 and to a landline network, such as a Public Switched Telephone Network (PSTN) 222 to satisfy circuit switched requirements. The connection through the PCU 208, the SGSN 216 and the GGSN 218 to a public or private network (Internet) 224 (also referred to herein generally as a shared network infrastructure) represents the data path for GPRS capable mobile devices. In a GSM network extended with GPRS capabilities, the BSC 204 also contains the Packet Control Unit (PCU) 208 that connects to the SGSN 216 to control segmentation, radio channel allocation and to satisfy packet switched requirements. To track the location of the mobile device 100 and availability for both circuit switched and packet switched management, the HLR 212 is shared between the MSC 210 and the SGSN 216. Access to the VLR 214 is controlled by the MSC 210.

The station 206 is a fixed transceiver station and together with the BSC 204 forms fixed transceiver equipment. The fixed transceiver equipment provides wireless network coverage for a particular coverage area commonly referred to as a “cell”. The fixed transceiver equipment transmits communication signals to and receives communication signals from mobile devices within its cell via the station 206. The fixed transceiver equipment normally performs such functions as modulation and possibly encoding and/or encryption of signals to be transmitted to the mobile device 100 in accordance with particular, usually predetermined, communication protocols and parameters, under control of its controller. The fixed transceiver equipment similarly demodulates and possibly decodes and decrypts, if necessary, any communication signals received from the mobile device 100 within its cell. Communication protocols and parameters may vary between different nodes. For example, one node may employ a different modulation scheme and operate at different frequencies than other nodes.

For all mobile devices 100 registered with a specific network, permanent configuration data such as a user profile is stored in the HLR 212. The HLR 212 also contains location information for each registered mobile device and can be queried to determine the current location of a mobile device. The MSC 210 is responsible for a group of location areas and stores the data of the mobile devices currently in its area of responsibility in the VLR 214. Further, the VLR 214 also contains information on mobile devices that are visiting other networks. The information in the VLR 214 includes part of the permanent mobile device data transmitted from the HLR 212 to the VLR 214 for faster access. By moving additional information from a remote HLR 212 node to the VLR 214, the amount of traffic between these nodes can be reduced so that voice and data services can be provided with faster response times while requiring less use of computing resources.

The SGSN 216 and the GGSN 218 are elements added for GPRS support, namely packet switched data support, within GSM. The SGSN 216 and the MSC 210 have similar responsibilities within the wireless network 200 in that each keeps track of the location of each mobile device 100. The SGSN 216 also performs security functions and access control for data traffic on the wireless network 200. The GGSN 218 provides internetworking connections with external packet switched networks and connects to one or more SGSNs 216 via an Internet Protocol (IP) backbone network operated within the network 200. During normal operations, a given mobile device 100 must perform a “GPRS Attach” to acquire an IP address and to access data services. This requirement is not present in circuit switched voice channels as Integrated Services Digital Network (ISDN) addresses are used for routing incoming and outgoing calls. Currently, all GPRS capable networks use private, dynamically assigned IP addresses, thus requiring the DHCP server 220 connected to the GGSN 218. There are many mechanisms for dynamic IP assignment, including using a combination of a Remote Authentication Dial-In User Service (RADIUS) server and a DHCP server. Once the GPRS Attach is complete, a logical connection is established from a mobile device 100, through the PCU 208, and the SGSN 216 to an Access Point Node (APN) within the GGSN 218. The APN represents a logical end of an IP tunnel that can either access direct Internet compatible services or private network connections. The APN also represents a security mechanism for the network 200, insofar as each mobile device 100 must be assigned to one or more APNs and mobile devices 100 cannot exchange data without first performing a GPRS Attach to an APN that they have been authorized to use. The APN may be considered to be similar to an Internet domain name such as “myconnection.wireless.com”.

Once the GPRS Attach operation is complete, a tunnel is created and all traffic is exchanged within standard IP packets using any protocol that can be supported in IP packets. This includes tunneling methods such as IP over IP, as is the case with some IP Security (IPsec) connections used with Virtual Private Networks (VPN). These tunnels are also referred to as Packet Data Protocol (PDP) Contexts and there are a limited number of these available in the network 200. To maximize use of the PDP Contexts, the network 200 will run an idle timer for each PDP Context to determine if there is a lack of activity. When a mobile device 100 is not using its PDP Context, the PDP Context can be de-allocated and the IP address returned to the IP address pool managed by the DHCP server 220.
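The idle-timer behavior described above can be illustrated with a minimal sketch. The class name `PdpContextPool`, its methods, and the use of caller-supplied timestamps are all hypothetical; the sketch only models a limited address pool from which idle PDP Contexts are reclaimed, and is not a description of any actual network element.

```python
class PdpContextPool:
    """Toy model (hypothetical) of a limited pool of PDP Contexts with an idle timer."""

    def __init__(self, addresses, idle_timeout):
        self.free = list(addresses)        # IP address pool (role of the DHCP server)
        self.idle_timeout = idle_timeout   # seconds of inactivity before reclaim
        self.active = {}                   # device id -> (ip, time of last activity)

    def attach(self, device, now):
        """Allocate a context; raises IndexError when the pool is exhausted."""
        ip = self.free.pop()
        self.active[device] = (ip, now)
        return ip

    def touch(self, device, now):
        """Record traffic on a context, restarting its idle timer."""
        ip, _ = self.active[device]
        self.active[device] = (ip, now)

    def expire(self, now):
        """De-allocate every context idle for at least idle_timeout."""
        for device, (ip, last) in list(self.active.items()):
            if now - last >= self.idle_timeout:
                del self.active[device]
                self.free.append(ip)       # address returns to the pool
```

A caller would invoke `expire` periodically with the current time; a device whose context has been reclaimed would need to attach again before exchanging data.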

Referring now to FIG. 4, shown therein is a block diagram illustrating components of an exemplary configuration of a host system 250 that the mobile device 100 can communicate with in conjunction with the connect module 144. The host system 250 will typically be a corporate enterprise or other local area network (LAN), but may also be a home office computer or some other private system, for example, in variant implementations. In this example shown in FIG. 4, the host system 250 is depicted as a LAN of an organization to which a user of the mobile device 100 belongs. Typically, a plurality of mobile devices can communicate wirelessly with the host system 250 through one or more nodes 202 of the wireless network 200.

The host system 250 comprises a number of network components connected to each other by a network 260. For instance, a user's desktop computer 262 a with an accompanying cradle 264 for the user's mobile device 100 is situated on a LAN connection. The cradle 264 for the mobile device 100 can be coupled to the computer 262 a by a serial or a Universal Serial Bus (USB) connection, for example. Other user computers 262 b-262 n are also situated on the network 260, and each may or may not be equipped with an accompanying cradle 264. The cradle 264 facilitates the loading of information (e.g. PIM data, private symmetric encryption keys to facilitate secure communications) from the user computer 262 a to the mobile device 100, and may be particularly useful for bulk information updates often performed in initializing the mobile device 100 for use. The information downloaded to the mobile device 100 may include certificates used in the exchange of messages.

It will be understood by persons skilled in the art that the user computers 262 a-262 n will typically also be connected to other peripheral devices, such as printers, etc. which are not explicitly shown in FIG. 4. Furthermore, only a subset of network components of the host system 250 are shown in FIG. 4 for ease of exposition, and it will be understood by persons skilled in the art that the host system 250 will comprise additional components that are not explicitly shown in FIG. 4 for this exemplary configuration. More generally, the host system 250 may represent a smaller part of a larger network (not shown) of the organization, and may comprise different components and/or be arranged in different topologies than that shown in the exemplary embodiment of FIG. 4.

To facilitate the operation of the mobile device 100 and the wireless communication of messages and message-related data between the mobile device 100 and components of the host system 250, a number of wireless communication support components 270 can be provided. In some implementations, the wireless communication support components 270 can include a message management server 272, a mobile data server (MDS) 274, a web server, such as Hypertext Transfer Protocol (HTTP) server 275, a contact server 276, and a device manager module 278. The device manager module 278 includes an IT Policy editor 280 and an IT user property editor 282, as well as other software components for allowing an IT administrator to configure the mobile devices 100. In an alternative embodiment, there may be one editor that provides the functionality of both the IT policy editor 280 and the IT user property editor 282. The support components 270 also include a data store 284, and an IT policy server 286. The IT policy server 286 includes a processor 288, a network interface 290 and a memory unit 292. The processor 288 controls the operation of the IT policy server 286 and executes functions related to the standardized IT policy as described below. The network interface 290 allows the IT policy server 286 to communicate with the various components of the host system 250 and the mobile devices 100. The memory unit 292 can store functions used in implementing the IT policy as well as related data. Those skilled in the art know how to implement these various components. Other components may also be included as is well known to those skilled in the art. Further, in some implementations, the data store 284 can be part of any one of the servers.

In this exemplary embodiment, the mobile device 100 communicates with the host system 250 through node 202 of the wireless network 200 and a shared network infrastructure 224 such as a service provider network or the public Internet. Access to the host system 250 may be provided through one or more routers (not shown), and computing devices of the host system 250 may operate from behind a firewall or proxy server 266. The proxy server 266 provides a secure node and a wireless internet gateway for the host system 250. The proxy server 266 intelligently routes data to the correct destination server within the host system 250.

In some implementations, the host system 250 can include a wireless VPN router (not shown) to facilitate data exchange between the host system 250 and the mobile device 100. The wireless VPN router allows a VPN connection to be established directly through a specific wireless network to the mobile device 100. The wireless VPN router can be used with the Internet Protocol (IP) Version 6 (IPv6) and IP-based wireless networks. This protocol can provide enough IP addresses so that each mobile device has a dedicated IP address, making it possible to push information to a mobile device at any time. An advantage of using a wireless VPN router is that it can be an off-the-shelf VPN component, and does not require a separate wireless gateway and separate wireless infrastructure. A VPN connection can preferably be a Transmission Control Protocol (TCP)/IP or User Datagram Protocol (UDP)/IP connection for delivering the messages directly to the mobile device 100 in this alternative implementation.

Messages intended for a user of the mobile device 100 are initially received by a message server 268 of the host system 250. Such messages may originate from any number of sources. For instance, a message may have been sent by a sender from the computer 262 b within the host system 250, from a different mobile device (not shown) connected to the wireless network 200 or a different wireless network, or from a different computing device, or other device capable of sending messages, via the shared network infrastructure 224, possibly through an application service provider (ASP) or Internet service provider (ISP), for example.

The message server 268 typically acts as the primary interface for the exchange of messages, particularly e-mail messages, within the organization and over the shared network infrastructure 224. Each user in the organization that has been set up to send and receive messages is typically associated with a user account managed by the message server 268. Some exemplary implementations of the message server 268 include a Microsoft Exchange™ server, a Lotus Domino™ server, a Novell Groupwise™ server, or another suitable mail server installed in a corporate environment. In some implementations, the host system 250 may comprise multiple message servers 268. The message server 268 may also be adapted to provide additional functions beyond message management, including the management of data associated with calendars and task lists, for example.

When messages are received by the message server 268, they are typically stored in a data store associated with the message server 268. In at least some embodiments, the data store may be a separate hardware unit, such as data store 284, that the message server 268 communicates with. Messages can be subsequently retrieved and delivered to users by accessing the message server 268. For instance, an e-mail client application operating on a user's computer 262 a may request the e-mail messages associated with that user's account stored on the data store associated with the message server 268. These messages are then retrieved from the data store and stored locally on the computer 262 a. The data store associated with the message server 268 can store copies of each message that is locally stored on the mobile device 100. Alternatively, the data store associated with the message server 268 can store all of the messages for the user of the mobile device 100 and only a smaller number of messages can be stored on the mobile device 100 to conserve memory. For instance, the most recent messages (i.e. those received in the past two to three months for example) can be stored on the mobile device 100.

When operating the mobile device 100, the user may wish to have e-mail messages retrieved for delivery to the mobile device 100. The message application 138 operating on the mobile device 100 may also request messages associated with the user's account from the message server 268. The message application 138 may be configured (either by the user or by an administrator, possibly in accordance with an organization's IT policy) to make this request at the direction of the user, at some pre-defined time interval, or upon the occurrence of some pre-defined event. In some implementations, the mobile device 100 is assigned its own e-mail address, and messages addressed specifically to the mobile device 100 are automatically redirected to the mobile device 100 as they are received by the message server 268.

The message management server 272 can be used to specifically provide support for the management of messages, such as e-mail messages, that are to be handled by mobile devices. Generally, while messages are still stored on the message server 268, the message management server 272 can be used to control when, if, and how messages are sent to the mobile device 100. The message management server 272 also facilitates the handling of messages composed on the mobile device 100, which are sent to the message server 268 for subsequent delivery.

For example, the message management server 272 may monitor the user's “mailbox” (e.g. the message store associated with the user's account on the message server 268) for new e-mail messages, and apply user-definable filters to new messages to determine if and how the messages are relayed to the user's mobile device 100. The message management server 272 may also, through an encoder 273, compress messages, using any suitable compression technology (e.g. YK compression, and other known techniques) and encrypt messages (e.g. using an encryption technique such as Data Encryption Standard (DES), Triple DES, or Advanced Encryption Standard (AES)), and push them to the mobile device 100 via the shared network infrastructure 224 and the wireless network 200. The message management server 272 may also receive messages composed on the mobile device 100 (e.g. encrypted using Triple DES), decrypt and decompress the composed messages, re-format the composed messages if desired so that they will appear to have originated from the user's computer 262 a, and re-route the composed messages to the message server 268 for delivery.

Certain properties or restrictions associated with messages that are to be sent from and/or received by the mobile device 100 can be defined (e.g. by an administrator in accordance with IT policy) and enforced by the message management server 272. These may include whether the mobile device 100 may receive encrypted and/or signed messages, minimum encryption key sizes, whether outgoing messages must be encrypted and/or signed, and whether copies of all secure messages sent from the mobile device 100 are to be sent to a pre-defined copy address, for example.

The message management server 272 may also be adapted to provide other control functions, such as only pushing certain message information or pre-defined portions (e.g. “blocks”) of a message stored on the message server 268 to the mobile device 100. For example, in some cases, when a message is initially retrieved by the mobile device 100 from the message server 268, the message management server 272 may push only the first part of a message to the mobile device 100, with the part being of a pre-defined size (e.g. 2 KB). The user can then request that more of the message be delivered in similar-sized blocks by the message management server 272 to the mobile device 100, possibly up to a maximum pre-defined message size. Accordingly, the message management server 272 facilitates better control over the type of data and the amount of data that is communicated to the mobile device 100, and can help to minimize potential waste of bandwidth or other resources.
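As a rough illustration of block-wise delivery, the following sketch splits a message into fixed-size blocks, optionally capped at a maximum message size. The function name `split_into_blocks` and its parameters are hypothetical; the 2 KB default mirrors the example size given above, interpreted here as 2048 bytes.

```python
def split_into_blocks(message: bytes, block_size: int = 2048, max_size=None):
    """Split a message into blocks of block_size bytes (hypothetical helper).

    If max_size is given, only the first max_size bytes are ever delivered.
    """
    deliverable = message if max_size is None else message[:max_size]
    return [deliverable[i:i + block_size]
            for i in range(0, len(deliverable), block_size)]


# The first block is pushed initially; later blocks are sent on request.
blocks = split_into_blocks(b"x" * 5000)
first_part, remaining = blocks[0], blocks[1:]
```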

The MDS 274 encompasses any other server that stores information that is relevant to the corporation. The mobile data server 274 may include, but is not limited to, databases, online data document repositories, customer relationship management (CRM) systems, or enterprise resource planning (ERP) applications. The MDS 274 can also connect to the Internet or other public network, through HTTP server 275 or other suitable web server such as a File Transfer Protocol (FTP) server, to retrieve HTTP webpages and other data. Requests for webpages are typically routed through MDS 274 and then to HTTP server 275, through suitable firewalls and other protective mechanisms. The web server then retrieves the webpage over the Internet, and returns it to MDS 274. As described above in relation to message management server 272, MDS 274 is typically provided, or associated, with an encoder 277 that permits retrieved data, such as retrieved webpages, to be compressed, using any suitable compression technology (e.g. YK compression, and other known techniques), and encrypted (e.g. using an encryption technique such as DES, Triple DES, or AES), and then pushed to the mobile device 100 via the shared network infrastructure 224 and the wireless network 200.

The contact server 276 can provide information for a list of contacts for the user in a similar fashion as the address book on the mobile device 100. Accordingly, for a given contact, the contact server 276 can include the name, phone number, work address and e-mail address of the contact, among other information. The contact server 276 can also provide a global address list that contains the contact information for all of the contacts associated with the host system 250.

It will be understood by persons skilled in the art that the message management server 272, the MDS 274, the HTTP server 275, the contact server 276, the device manager module 278, the data store 284 and the IT policy server 286 do not need to be implemented on separate physical servers within the host system 250. For example, some or all of the functions associated with the message management server 272 may be integrated with the message server 268, or some other server in the host system 250. Alternatively, the host system 250 may comprise multiple message management servers 272, particularly in variant implementations where a large number of mobile devices need to be supported.

The device manager module 278 provides an IT administrator with a graphical user interface with which the IT administrator interacts to configure various settings for the mobile devices 100. As mentioned, the IT administrator can use IT policy rules to define behaviors of certain applications on the mobile device 100 that are permitted such as phone, web browser or Instant Messenger use. The IT policy rules can also be used to set specific values for configuration settings that an organization requires on the mobile devices 100 such as auto signature text, WLAN/VoIP/VPN configuration, security requirements (e.g. encryption algorithms, password rules, etc.), specifying themes or applications that are allowed to run on the mobile device 100, and the like.

According to one aspect, we provide a method and system for grouping contexts from a given context model together to create a new context model that has fewer contexts, but retains acceptable compression gains compared to the context model with more contexts.

According to an exemplary embodiment consistent with this aspect, a set of files that are correlated to the file to be compressed (hereafter called training files) are read to determine, for an initial context model, the empirical statistics of contexts and symbols. In some embodiments, this includes determining the estimated joint and conditional probabilities of the various contexts and symbols (or blocks of symbols). Examples of an initial context model include, but are not limited to, the previous l symbols, any combination of the previous l symbols, any combination of bits from the previous l symbols, or the context model derived from a previous iteration of this method and system.
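Gathering these statistics can be sketched as follows, assuming the simplest of the initial context models listed above: the previous single byte is the context of the current byte. The function name `empirical_statistics` and the choice of context 0 for the first symbol of each file are assumptions made for illustration only.

```python
from collections import Counter


def empirical_statistics(training_files, initial_context=0):
    """Estimate joint and conditional probabilities of (context, symbol) pairs.

    Assumes the context of each byte is the previous byte; the context of the
    first byte of each file is initial_context (an arbitrary choice).
    """
    joint = Counter()
    for data in training_files:
        context = initial_context
        for symbol in data:
            joint[(context, symbol)] += 1
            context = symbol          # previous-byte model: next context = symbol
    total = sum(joint.values())
    context_totals = Counter()
    for (c, _), n in joint.items():
        context_totals[c] += n
    joint_p = {pair: n / total for pair, n in joint.items()}
    cond_p = {(c, x): n / context_totals[c] for (c, x), n in joint.items()}
    return joint_p, cond_p


# Training files are given as bytes objects, e.g. read with open(path, "rb").
joint_p, cond_p = empirical_statistics([b"abab", b"abba"])
```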

The initial context model is then reduced to a desired number of contexts, for example, by applying a grouping function g to the original set of contexts to obtain a new and smaller set of contexts.

For example, consider a data sequence $x_1, x_2, \ldots, x_m$ (consisting of one training file or a concatenation of several training files) along with its respective context sequence $c_1, c_2, \ldots, c_m$ generated from the initial context model. Determine their corresponding $n$th order empirical statistics which, say, are represented by the joint probability distribution $P(X_1, X_2, \ldots, X_n, C_1, C_2, \ldots, C_n)$ of random symbols $X_1, X_2, \ldots, X_n$ and their corresponding contexts $C_1, C_2, \ldots, C_n$. If the initial context model is such that the next context is determined by the current context and current symbol, in other words, there is a function $f$ such that the next context is $c_{i+1} = f(x_i, c_i)$ for $i = 1, 2, \ldots, m$, then one can apply CBYK along with the initial context model to compress the data sequence. The resulting CBYK compression rate $r$, when the total length $m$ is large enough, is given as

$$r \le \frac{H(X_1, \ldots, X_n \mid C_1)}{n} + o\left(\frac{\log(m)}{m}\right)$$

where $H(X_1, \ldots, X_n \mid C_1)$ is the conditional entropy of $X_1, \ldots, X_n$ given $C_1$. To reduce the memory requirements, a grouping function $g$ is applied to the original set of contexts $\{c^1, c^2, \ldots, c^j\}$ to obtain a new and smaller set of contexts $\{\hat{c}^1, \hat{c}^2, \ldots, \hat{c}^k\}$. Accordingly, we want to choose a mapping function $g$:

$$g : \varepsilon = \{c^1, c^2, \ldots, c^j\} \to \hat{\varepsilon} = \{\hat{c}^1, \hat{c}^2, \ldots, \hat{c}^k\}$$

wherein $k < j$.

Embodiments of the invention relate to choosing a mapping function $g$ which satisfies:

$$g(f(x, c)) = g(f(x, c')) \quad \text{whenever } g(c) = g(c') \tag{1}$$

for any symbol $x$ and any $c, c' \in \{c^1, c^2, \ldots, c^j\}$, and keeps the processing requirements and the compression rate $r$ for CBYK within acceptable limits, wherein

$$r \le \frac{H(X_1, \ldots, X_n \mid g(C_1))}{n} + o\left(\frac{\log(m)}{m}\right)$$
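Condition (1) requires that contexts placed in the same group always lead to next contexts that are again in a common group, so that the reduced context can be tracked without knowing the original one. For small, enumerable context sets this can be checked by brute force. The following is a minimal sketch; the function name `satisfies_condition_1` and the two example models are illustrative assumptions, not part of the described method.

```python
from itertools import product


def satisfies_condition_1(g, f, symbols, contexts):
    """Return True if g(f(x, c)) == g(f(x, c2)) whenever g(c) == g(c2)."""
    for c, c2 in product(contexts, repeat=2):
        if g(c) != g(c2):
            continue                  # condition only constrains grouped contexts
        for x in symbols:
            if g(f(x, c)) != g(f(x, c2)):
                return False
    return True


# Assumed example 1: previous-symbol model, f(x, c) = x.
# Any grouping satisfies (1), because the next context ignores c.
prev_symbol_ok = satisfies_condition_1(
    g=lambda c: c % 2, f=lambda x, c: x, symbols=range(4), contexts=range(4))

# Assumed example 2: context = last two symbols (older, newer).
# Grouping by the older symbol violates (1); grouping by the newer one does not.
pairs = list(product([0, 1], repeat=2))
shift = lambda x, c: (c[1], x)
by_older = satisfies_condition_1(lambda c: c[0], shift, [0, 1], pairs)
by_newer = satisfies_condition_1(lambda c: c[1], shift, [0, 1], pairs)
```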

One method of finding a grouping $g$ with the above property such that $r$ is minimized is to find a $g$ such that $H(X_1, \ldots, X_n \mid g(C_1))$ is minimized among all grouping functions with the above property. According to one embodiment, this is done by grouping contexts together on an iterative basis, and finding a local minimum for each iteration.
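One way to read the iterative procedure is as a greedy pairwise merge: at each iteration, try every pair of current groups, and keep the merge that yields the smallest conditional entropy. The sketch below makes several simplifying assumptions that are not taken from the source: it uses first-order statistics ($n = 1$), takes counts of (context, symbol) pairs as input, and assumes a previous-symbol context model so that condition (1) holds for every grouping and need not be checked. The names `conditional_entropy` and `greedy_group` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def conditional_entropy(counts, group_of):
    """H(X | g(C)) in bits, where counts maps (context, symbol) -> count."""
    grouped = Counter()
    for (c, x), n in counts.items():
        grouped[(group_of[c], x)] += n
    group_totals = Counter()
    for (grp, _), n in grouped.items():
        group_totals[grp] += n
    total = sum(grouped.values())
    return -sum((n / total) * math.log2(n / group_totals[grp])
                for (grp, _), n in grouped.items())


def greedy_group(counts, desired):
    """Merge groups pairwise until only `desired` groups remain."""
    group_of = {c: c for (c, _) in counts}     # start: each context is its own group
    while len(set(group_of.values())) > desired:
        labels = sorted(set(group_of.values()))
        best = None
        for a, b in combinations(labels, 2):
            # Tentatively merge group b into group a.
            trial = {c: (a if lab == b else lab) for c, lab in group_of.items()}
            h = conditional_entropy(counts, trial)
            if best is None or h < best[0]:
                best = (h, trial)              # local minimum for this iteration
        group_of = best[1]
    return group_of


counts = {(0, "a"): 5, (1, "a"): 5, (2, "b"): 5, (3, "b"): 4, (3, "a"): 1}
group_of = greedy_group(counts, desired=2)
```

Each iteration examines every pair of groups, so the cost grows quadratically with the number of remaining groups; since the grouping is computed offline from training files, this may be acceptable for modest context counts.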

In some embodiments, an iterative procedure is executed to incrementally reduce the number of contexts to the desired number of contexts. A context model including a reduced set of contexts is then used by both a compression encoder to compress the file, and by a corresponding decoder to decompress the file. For example, referring back to FIG. 4, an encoder can be located at one or all of the servers such as mobile data server 274, or message management server 272, and the corresponding decoder on mobile device 100, thus facilitating the transmission of data between various servers and the mobile device 100. Similarly, an encoder can be present at the mobile device 100 and a corresponding decoder at one or all of the servers with which the mobile device communicates. It should be noted that the encoder and decoder can be co-located (for example, if compression is used to reduce storage space) or can be located in separate devices (if compression is used for transmission purposes).

To incrementally reduce the number of contexts to the desired number of contexts, a statistical analysis is performed on a large number of training files. “Large number” preferably implies a sufficient number of files that adding one additional file does not significantly change the empirical statistics. In one exemplary embodiment a thousand training files were used. In some embodiments the joint and conditional probabilities are calculated. As different categories of data files may have different structures and different recurrences of contexts, embodiments of the invention will use different categories of training files dependent on the file categorization criteria of the data file to be compressed. File categorization criteria can include file type, content type, language, and file structure. Note that all categories of data file (for example, web page, email, word document, spreadsheet, executable, picture, multi-media file, blog, portal, etc.) will have the same initial context model including the original set of contexts; for example, the initial context model may use the previous single byte as the context for the current symbol. However, the different categories of training files may have different probabilities, which would in turn result in a different set of reduced contexts in the end. Accordingly, assigning an initial context length, which is typically set equal to a predetermined number of bits (or bytes), depends on the file categorization criteria.

According to one embodiment, such a context model is developed in advance of, and then subsequently used by, a compression algorithm which uses the context model to compress a data file. For example, a mapping file M which represents a mapping from all original contexts to the reduced context set corresponding to the grouping function g is determined in advance, and used by both a CBYK encoder and the corresponding CBYK decoder.
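The source does not specify a layout for the mapping file M. As one possible illustration, assuming 256 original contexts (the previous byte) and at most 256 reduced contexts, M could be a lookup table with one byte per original context. The helper names `build_mapping_table` and `reduced_context` are hypothetical, and `group_of` is assumed to assign a group label to every original context.

```python
def build_mapping_table(group_of, num_contexts=256):
    """Build a table where entry c is the reduced context id of original context c.

    Assumed layout: one byte per original context, reduced ids numbered from 0
    in order of first appearance. Requires at most 256 distinct groups.
    """
    ids = {}
    table = bytearray(num_contexts)
    for c in range(num_contexts):
        label = group_of[c]
        table[c] = ids.setdefault(label, len(ids))   # assign the next free id
    return bytes(table)


def reduced_context(table, original_context):
    """Look up g(c); encoder and decoder would both use the same table."""
    return table[original_context]


# Illustrative grouping only: four groups of 64 consecutive byte values.
table = build_mapping_table({c: c // 64 for c in range(256)})
```

The resulting `bytes` object could be written to disk with `open(path, "wb")` and shipped with both the encoder and the decoder, so that neither has to run the grouping procedure itself.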

According to one embodiment, the mapping file M which represents the grouping function g is determined in advance, and used by both a CBYK encoder and the corresponding CBYK decoder. For example, assume that a CBYK encoder is located at a server A storing Data file X, and the corresponding CBYK decoder is located at terminal B which requires Data file X. To facilitate transmission of X from A to B, X is compressed by the CBYK encoder at A, and then decompressed by the CBYK decoder at B. However, neither the CBYK encoder at A nor the CBYK decoder at B needs to execute the algorithm to determine the context model. Instead, according to an embodiment of the invention, the context model is determined in advance and stored at both A and B. Thus, A and B only need to implement the CBYK compression algorithm itself, based on said context model.

Alternatively, rather than creating such a context model in advance, given sufficient processing power, the context model can be continuously updated at both A and B to provide a more reliable and up-to-date context model. Another alternative is the continuous update of the context model at A. In this case, the model is updated at A and transmitted to B prior to the transmission of the compressed data. Yet another alternative is to periodically update the model at A, for example after some number of messages, and then transmit the updated context model from A to B. While the new context model is being generated, the previous context model is used, and the updated context model is not used until it has been transmitted to B, for example in a separate message.

FIG. 5 is a flow chart illustrating method steps according to one embodiment. In this embodiment the process takes into account the fact that the joint and conditional probabilities can depend on the category of file to be compressed, as discussed above.

Accordingly, once the system receives a file to compress 1000, it first attempts to determine the category of file 1010. The system then uses the category of file to determine an initial context set. This initial context set has a context length l 1020, which defines an initial number of contexts to be used based on the category of file. The system then sets the value of a variable representing the current number of contexts to the initial number of contexts.

Similarly, the system will then determine the joint and conditional probabilities for the different initial contexts and symbols (or blocks of symbols) based on the file type 1030. Note that the joint and conditional probabilities can be determined, for example, by gathering empirical statistics for that data type from a set of training files 1040 which represents a large sample data set from a number of files of that category. In some embodiments, only the second order empirical statistics are used in order to reduce memory requirements. However, the system and methods are not limited to the second order, and one can also collect and apply the n^(th) order empirical statistics where n is a prescribed parameter.
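
By way of illustration only, the gathering of second order empirical statistics from training files could be sketched in Python as follows. The sketch assumes that the context of each symbol is the preceding byte and that the training files are supplied as byte strings. The function names gather_statistics and joint_and_conditional, and the initial context value 0, are hypothetical and are not taken from the embodiments described.

```python
from collections import Counter


def gather_statistics(training_files, initial_context=0):
    """Count (context, symbol) pairs, where the context is the previous byte."""
    pair_counts = Counter()
    for data in training_files:
        context = initial_context  # assumed fixed starting context
        for symbol in data:
            pair_counts[(context, symbol)] += 1
            context = symbol  # the byte just seen becomes the next context
    return pair_counts


def joint_and_conditional(pair_counts):
    """Turn raw counts into joint p(c, x) and conditional p(x | c) estimates."""
    total = sum(pair_counts.values())
    context_totals = Counter()
    for (context, _symbol), n in pair_counts.items():
        context_totals[context] += n
    joint = {pair: n / total for pair, n in pair_counts.items()}
    conditional = {
        (c, s): n / context_totals[c] for (c, s), n in pair_counts.items()
    }
    return joint, conditional
```

In such a sketch, a separate set of counts would be gathered for each category of file, since the statistics can depend on the file type.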

A desired number of contexts is then assigned 1050. Note the desired number of contexts is a parameter supplied based on memory requirements. Generally, this number represents a tradeoff between the amount of memory required for the compression and/or decompression process and the amount of compression (i.e. the more memory that is available for use, the greater the amount of compression, within limits). The system determines whether the current number of contexts exceeds the desired number of contexts 1060. If so, the system reduces the contexts 1070 until the number of contexts equals the desired number of contexts. The system then generates a mapping file 1080 representing the reduced context set.

Note other determinations can optionally be made based on the category of file, for example which “order” (in the sense of first, second, . . . , or n^(th) order) statistics/entropy are used, based on the prescribed parameter.

According to some embodiments of the invention, the context length and desired number of contexts can be selected based on the type of device used, either at the compression or decompression stage. In typical transmission situations, the transmitting device compresses the data file prior to transmission and the receiving device decompresses the received file. The limiting factor often depends on the device with the lowest capacity. For example, a handheld mobile communications device is likely to have less capacity (e.g. memory and processing power) than a desktop computer. In addition, we note that the compression process and the decompression may have different capacity requirements, as the CBYK encoder requires more memory. Thus, the context length and desired number of contexts will be selected to be smaller for a mobile device transmitting a file (e.g. if it is used as a modem) than if both the transmitter and receiver are high-end computers.

These factors may also of course depend on the alphabet or language used in text or text based application files. For example, English using ASCII encoding will only require a single byte to represent every character in the alphabet, whereas other languages with a larger alphabet may require two bytes to represent every character. Accordingly, a single byte may be an appropriate context length for one language, whereas two bytes may be more appropriate for others. Accordingly in some embodiments, determining an initial context model includes determining the size of the alphabet used based on the determined file categorization criteria and then assigning an initial context length equal to a number of bits (which may be in the form of bytes) based on the number of bits needed to encode all elements of said alphabet, and wherein the initial context set is derived based on this context length.
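
As a minimal sketch of the above, and under the assumption that the context length is simply the smallest number of bits able to index every element of the alphabet (optionally rounded up to whole bytes), the initial context length could be computed as follows. The function name initial_context_length_bits and its round_to_bytes parameter are hypothetical.

```python
def initial_context_length_bits(alphabet_size, round_to_bytes=True):
    """Bits needed to give every element of the alphabet a distinct code."""
    if alphabet_size < 1:
        raise ValueError("alphabet_size must be at least 1")
    # (n - 1).bit_length() is the exact ceiling of log2(n) for n >= 2.
    bits = max(1, (alphabet_size - 1).bit_length())
    if round_to_bytes:
        bits = 8 * ((bits + 7) // 8)  # round up to a whole number of bytes
    return bits
```

For a 128 character alphabet this yields a context length of one byte, while an alphabet of, say, 20000 characters yields two bytes.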

In some embodiments, each category of data file will have a different context length and a different set of training files, and therefore a different set of joint and conditional probabilities.

FIG. 6 is a flow chart illustrating method steps according to another embodiment. In particular, FIG. 6 illustrates one implementation for reducing the context set. FIG. 6 illustrates an initialization procedure which generates an initial context model and assigns various parameters in steps 2000 to 2040. Steps 2100 through 2180 illustrate an example of reducing the initial context model to a desired number of contexts (which is defined in 2050). In this example, an iterative procedure is implemented to combine legitimate groups of contexts together until the current number of contexts equals the desired number of contexts 2060. Two groups of contexts are said to be legitimate if, when they are combined, certain properties (for example, the property expressed in Equation (1) in some embodiments) are maintained.

The first step in the initialization procedure involves establishing an initial context model comprising all of the elements of a context set 2000. As the initial context model can depend on the file type, some embodiments generate an initial context model for each file type. Alternatively, a default context model is derived for all file types. As part of the initialization procedure, a variable defining the current number of contexts is set to the number of contexts in the initial context set 2010. Then, each context within the context set is assigned to a context group 2020 (such that each group initially includes a single context). The joint and the conditional probabilities for the different contexts and symbols (or blocks of symbols) are then collected, for example from a large sample data set 2040. As discussed herein, a desired number of contexts based on memory requirements is set. It should be appreciated that a default context set with a default initial number of contexts and a default desired number of contexts may be set and used in all applications. However, as discussed herein, these values may be preferentially assigned based on the file type and the desired memory requirements.

If there are too many contexts 2060, that is to say, the current number of contexts exceeds the desired number of contexts, the method reduces the number of contexts by applying a grouping function to the set of contexts to combine the contexts into a smaller set of contexts. Note that a smaller set of contexts refers to a smaller number of contexts.

In the embodiment shown in FIG. 6, a particular grouping function is illustrated which for each current number of contexts chooses two legitimate groups to combine based on a local minimization scheme. This is repeated iteratively, with each iteration attempting to reduce the current number of contexts, until the desired number of contexts is reached. For each iteration, the process starts by creating a temporary context model which is obtained by pairing two legitimate context groups to form a single context group 2100. The process then updates the joint and conditional probabilities 2110 and recalculates the conditional entropy of the set. If this is the first pairing of two legitimate context groups 2130 (for the current number of contexts), then the current pairing of two legitimate context groups is recorded as the pair that gives the lowest conditional entropy 2150 (for the current number of contexts). Otherwise, the process determines whether the conditional entropy associated with the current pairing is lower than the recorded value 2140. If it is lower, then the pair is recorded as being the pair of contexts that gives the lowest conditional entropy 2150. Then, the temporary context model is removed. This process is continued until there are no more legitimate pairings for the current number of contexts 2160. In other words, once all legitimate pairings for the current number of contexts have been evaluated, then the routine progresses to step 2170.

The process then reduces the current number of contexts by one, by grouping the two legitimate context groups that were recorded as being the pair of contexts that gave the lowest conditional entropy for the current number of contexts 2170. In other words, the method replaces each group of the recorded pair with a single group which comprises its constituent elements. Thus, the number of contexts is reduced. This step is repeated until the current number of contexts equals the desired number of contexts 2060.
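
The iterative pairing procedure of FIG. 6 can be sketched, by way of example only, as the following Python routine. The sketch makes several assumptions that are not part of the description: statistics are held as a dictionary of (context, symbol) counts, every pairing is treated as legitimate (as would be the case when the last byte is used as the initial context model), and the temporary context model is rebuilt from scratch for every candidate pairing, which is simple but slow. The names conditional_entropy and reduce_contexts are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def conditional_entropy(groups, pair_counts):
    """H(symbol | context group) in bits, from (context, symbol) counts."""
    group_of = {c: i for i, group in enumerate(groups) for c in group}
    merged = Counter()
    for (context, symbol), n in pair_counts.items():
        merged[(group_of[context], symbol)] += n
    group_totals = Counter()
    for (g, _symbol), n in merged.items():
        group_totals[g] += n
    total = sum(merged.values())
    return -sum(
        n / total * math.log2(n / group_totals[g])
        for (g, _symbol), n in merged.items()
    )


def reduce_contexts(contexts, pair_counts, desired):
    """Greedily merge context groups until only `desired` groups remain."""
    if desired < 1:
        raise ValueError("desired number of contexts must be at least 1")
    # Each context starts in its own group (step 2020).
    groups = [frozenset([c]) for c in contexts]
    while len(groups) > desired:
        best = None  # (entropy, grouping) of the best pairing so far
        for a, b in combinations(range(len(groups)), 2):
            # Temporary context model with groups a and b combined (2100).
            trial = [g for i, g in enumerate(groups) if i not in (a, b)]
            trial.append(groups[a] | groups[b])
            h = conditional_entropy(trial, pair_counts)
            # Record the pairing if it is the first or the lowest so far.
            if best is None or h < best[0]:
                best = (h, trial)
        groups = best[1]  # commit the recorded pairing (2170)
    return groups
```

Every context appearing in pair_counts is assumed to be listed in contexts. An embodiment that must respect Equation (1) would skip non-legitimate pairings inside the inner loop.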

Once the desired number of contexts is reached, the process generates a mapping file 2070, which we will refer to as M, which comprises the current set of context groupings.
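
The description does not prescribe a layout for the mapping file M. Purely as an assumption for illustration, M could be held as a sequence of 256 bytes in which entry c stores the index of the group containing original context c. The function name groups_to_mapping and this layout are hypothetical.

```python
def groups_to_mapping(groups, num_contexts=256):
    """Flatten a list of context groups into one group index per context."""
    if len(groups) > 256:
        raise ValueError("this byte-per-entry layout supports at most 256 groups")
    mapping = [None] * num_contexts
    for index, group in enumerate(groups):
        for context in group:
            mapping[context] = index
    if None in mapping:
        raise ValueError("every original context must belong to a group")
    # One byte per original context; could be written to disk as-is.
    return bytes(mapping)
```

With this layout, applying the grouping function g to an original context is a single table lookup.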

At this point we should clarify that although each iteration of the grouping function determines which legitimate pairs of context groups should be combined, there is no restriction that each context group itself be limited to only two of the original contexts of the initial context set. In other words, during any given iteration, two groups which have already been paired may be subsequently paired again, while single elements may remain in the final context set. This can occur for example when there are many different possibilities for one particular context, whereas several other contexts can be combined and still act as a good predictor of the next parsed symbol or phrase. As a very simple example of this based on an English language alphabet, pairing the punctuation marks ‘?’ and ‘.’ together to form a single context group can act as a good predictor of the next symbol because the conditional probabilities of the following character will not have changed significantly between the new context group and the individual contexts ‘?’ and ‘.’.

In the specific case of embodiments used with CBYK, it is preferable for the reduced context model to satisfy a transfer function which ensures certain performance properties of the CBYK algorithm in all cases. In this specific example, let C be an arbitrary (finite or infinite) set of contexts. In a general context model, for any sequence x = x₁x₂ . . . x_(m) drawn from an alphabet A, there is a context sequence c₁c₂ . . . c_(m) derived from C. In this context sequence, each c_(i) can be determined from x₁x₂ . . . x_(i−1) and c₁ in some manner. For context models called state machine context models, the transfer function can be written as

c_(i+1) = f(x_(i), c_(i)), i = 1, 2, . . .

where c₁ ∈ C is an initial context and assumed to be fixed, and f is a mapping which will be referred to as a next context function.
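
As a simple illustration of a state machine context model, the following Python sketch derives the context sequence c₁c₂ . . . c_(m) for a byte sequence by repeatedly applying a next context function f. The particular f shown, in which the next context is simply the symbol just seen (a last byte model), and the initial context value 0 are assumptions chosen for the example. The function names are hypothetical.

```python
def next_context_last_byte(symbol, context):
    """Next context function f(x_i, c_i) for a last byte model: c_(i+1) = x_i."""
    return symbol


def context_sequence(data, f, initial_context=0):
    """Return [c_1, ..., c_m] for data = x_1 ... x_m using c_(i+1) = f(x_i, c_i)."""
    if not data:
        return []
    contexts = [initial_context]  # c_1 is fixed in advance
    for symbol in data[:-1]:
        contexts.append(f(symbol, contexts[-1]))
    return contexts
```

Each context depends only on the preceding symbol and the preceding context, which is the property the transfer function expresses.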

If the reduced context set creates a situation where the next context cannot be derived from the current context and the current parsed phrase, then the related performance properties of CBYK are no longer necessarily true for all sources. Accordingly, this transfer function imposes an additional restriction, i.e., Equation (1), on the grouping function from which the resulting context model is derived, in some embodiments.

Accordingly, the mapping file preferably only includes context groups which satisfy this transfer function. This can be implemented in several ways.

One way to satisfy this requirement is to start with an initial context set that will always satisfy the transfer function (even when it is reduced). One example where this is true is when the last byte is used as the initial context model. Accordingly, some embodiments utilize the last byte as the initial context model. This simplifies processing, as it removes the necessity to check whether the transfer function is satisfied. Also, for cases which are constrained by memory requirements, using the last byte as the initial context model uses less memory than context models with longer context lengths.

Alternatively, one can satisfy the transfer function requirement by only grouping contexts together which satisfy such a function. For example, one embodiment starts with an initial context set that satisfies the transfer function requirement and ignores pairings that will violate the transfer function.

As another alternative, another embodiment can start with any initial context set. Then after the initial set is reduced such that the desired number of contexts is satisfied, the set is evaluated to determine whether the transfer function requirement is satisfied. If not, the system continues reducing it until the transfer function requirement is satisfied.

According to one embodiment, the previous byte is used as the initial context model for a compression algorithm. As a byte consists of 8 bits, there are 256 possible values (2⁸). Thus, the initial number of contexts is 256. So {0, 1, . . . , 255} represents the complete set of initial contexts for a compression algorithm which uses the previous byte as the initial context model. Let us assume that the memory requirements of a particular device make it desirable to reduce this number of contexts, for example to 64 contexts. According to an exemplary embodiment we produce a grouping function g which maps the original set of contexts (from 0 to 255) to a new set of contexts using the notation:

g: {0, 1, . . . , 255} → Ĉ = {ĉ¹, ĉ², . . . , ĉ⁶⁴}

The function g satisfies Equation (1), and the next reduced context can be determined from the current reduced context ĉ_(i) and the current symbol x_(i) after x_(i) is parsed.

We will now discuss a couple of methods of determining the next reduced context.

One method is to simply use a two dimensional array. For the above example, each current symbol x_(i) takes values from {0, 1, 2, . . . , 255}, and each current reduced context ĉ_(i) takes values from the reduced context set Ĉ. Accordingly, one can simply use a 64*256 array to find the next reduced context, with each element of the array giving the next reduced context to use in response to the current reduced context ĉ_(i) and the current symbol x_(i).

Another embodiment utilizes the mapping function g not only to group the contexts into a reduced set of contexts to be used in CBYK, but also to determine the next reduced context ĉ_(i+1) = g(c_(i+1)), wherein c_(i+1) can be determined after x_(i) is parsed. For the above specific example, doing this means the system only needs a one dimensional array of size 256, with each element of the array giving the next reduced context in response to any possible value c_(i+1) ∈ {0, 1, . . . , 255}, rather than an array of size 64*256.
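
The two lookup methods can be compared with the following Python sketch. It assumes the last byte model, so that the next original context c_(i+1) equals the current symbol x_(i), and it uses an arbitrary placeholder grouping (each run of four consecutive byte values forms one group) in place of a grouping function obtained from empirical statistics. The names g, table_2d, next_reduced_2d and next_reduced_1d are hypothetical.

```python
NUM_SYMBOLS = 256   # possible values of the current symbol x_i
NUM_REDUCED = 64    # size of the reduced context set

# Placeholder grouping function g: NOT derived from statistics, used only
# so that the two lookup methods can be exercised.
g = [byte >> 2 for byte in range(NUM_SYMBOLS)]

# Method 1: a 64*256 array indexed by (current reduced context, symbol).
table_2d = [
    [g[symbol] for symbol in range(NUM_SYMBOLS)]
    for _ in range(NUM_REDUCED)
]


def next_reduced_2d(current_reduced, symbol):
    """Look up the next reduced context in the two dimensional array."""
    return table_2d[current_reduced][symbol]


def next_reduced_1d(symbol):
    """Derive the next original context, then apply g with one lookup."""
    next_original = symbol  # last byte model: c_(i+1) = x_i
    return g[next_original]
```

Under these assumptions both methods return the same result, but the second stores 256 entries rather than 64*256 = 16384.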

FIG. 7 is a block diagram which illustrates an apparatus for performing CBYK compression using a reduced context set, according to one embodiment. The apparatus 1200 consists of a parser 1206, a context-dependent grammar updating device 1210, a context-dependent grammar coder 1214 and a context generator 1270. The parser 1206 accepts as input an input string 1204 and a context generated by the context generator 1270, and parses the input string 1204 into a series of non-overlapping substrings 1208. The parser causes the transmission of the substrings 1208 to the context-dependent grammar updating device 1210, which accepts the received substrings 1208 and the context from the context generator 1270 as input and in turn produces an updated context-dependent grammar G. The context-dependent grammar updating device 1210 transmits the updated grammar G to the context-dependent grammar coder 1214 which then encodes the grammar G into a compressed binary codeword 1216.

Meanwhile, one output from the parser 1206 is sent to the context generator 1270, which uses the last parsed phrase and the context of the last parsed phrase, in conjunction with the mapping file M 1280 (which for example is generated by the methods described herein), to produce the current context, which is used for parsing the current phrase of the input string and for updating the context-dependent grammar G. For example, a suffix search is first performed on the current parsed phrase and the current context to find a representative context in the original unreduced context set and then the mapping file M 1280 is applied to the representative context to get a context in the reduced context set, which is the context for the next parsed phrase, which is then sent to both the parser 1206 and the context-dependent grammar updating device 1210.

FIG. 8 illustrates a method of generating contexts from a reduced context set for CBYK compression. Such a method determines the next original context 1330 from the previous parsed phrase 1300, for example the output from parser 1206 in FIG. 7, and from the previous original context 1320. The output from this stage is the next original context 1340, which becomes the previous original context 1320 in the next iteration. The next original context 1340 is then used to generate the next reduced context 1350 according to the mapping file (M) 1360, based on the g function. The next reduced context 1370 is then sent as an input to both the parser 1206 and the context-dependent grammar updating device 1210 in FIG. 7. The parser then generates the next parsed phrase 1390, which is then used, along with the next reduced context, to update the context-dependent grammar 1380. The next parsed phrase is also sent to the context generator 1270 as the previous parsed phrase 1300 for the next iteration.
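
The loop of FIG. 8 could be sketched in Python as follows. The sketch assumes the last byte model, in which the representative original context found for a parsed phrase is simply the final byte of that phrase, and it assumes that the mapping file M has been loaded as a 256 entry table. The class name ContextGenerator, its method names and the initial context value 0 are hypothetical. The parser and the grammar updating device are outside the scope of the sketch.

```python
class ContextGenerator:
    """Tracks the original context and maps it to a reduced context."""

    def __init__(self, mapping, initial_context=0):
        self.mapping = mapping                   # table form of M (g function)
        self.original_context = initial_context  # previous original context

    def current_reduced(self):
        """Reduced context to use for parsing the current phrase."""
        return self.mapping[self.original_context]

    def advance(self, parsed_phrase):
        """Consume the phrase just parsed and return the next reduced context."""
        if parsed_phrase:
            # Last byte model: the next original context is the final
            # byte of the previous parsed phrase.
            self.original_context = parsed_phrase[-1]
        return self.mapping[self.original_context]
```

In use, the value returned by advance would be supplied to both the parser and the grammar updating device before the next phrase is parsed.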

The apparatus 1200 may be provided as one or more application specific integrated circuits (ASIC), field programmable gate arrays (FPGA), electrically erasable programmable read-only memories (EEPROM), programmable read-only memories (PROM), programmable logic devices (PLD), or read-only memory devices (ROM). In some embodiments, the apparatus 1200 may be implemented using one or more microprocessors, and an appropriate computer program product embodied in a machine-readable medium storing instructions, which when executed by a processor, implements the functions of the blocks shown. It should be noted that apparatus 1200 can form part of the encoders 273 and/or 277 of FIG. 4. However, as previously stated, other embodiments of the invention are not limited to a transmission system, and can be used for data storage and retrieval within a single computer or network.

It should be readily apparent to a person skilled in the art that FIG. 7 illustrates an embodiment for compressing a data string, and that a corresponding decompression system uses the mapping function and the grammar to decompress the binary code word to reconstruct the original data string. Such a system can be implemented using similar blocks, and in a similar manner to the apparatus 1200, except the decompression system will not need a parser. As but one example, such an apparatus can form part of the decoder 103, or alternatively can be located within a network server or the host system 250. Furthermore, once again we note that such a decompression system can be used for data storage and retrieval within a single computer or network.

The above-described embodiments of the present invention are intended to be examples only. Alterations, modifications and variations may be effected to the particular embodiments by those of skill in the art without departing from the scope of the invention, which is defined solely by the claims appended hereto.

What is claimed is:
 1. A method of generating a context model for context-based compression, the method comprising: determining empirical statistics for a file type of a file to be compressed; and generating the context model by iteratively grouping contexts of an initial context model in accordance with the empirical statistics, the context model having fewer contexts than the initial context model.
 2. The method as claimed in claim 1, wherein the determining the empirical statistics includes determining joint, conditional, and unconditional probabilities.
 3. The method of claim 1, further comprising generating a mapping file for mapping data elements to a set of contexts of size equal to a size of the context model.
 4. The method of claim 3, wherein the determining the empirical statistics comprises: determining file categorization criteria for the file type; determining the initial context model including an initial value for a current number of contexts and an initial context set based on the file categorization criteria; and determining, for the initial context model, the empirical statistics of contexts and symbols.
 5. The method as claimed in claim 4, wherein the generating the context model by iteratively grouping contexts comprises iteratively using the empirical statistics to incrementally reduce the number of contexts until the number of contexts equals a predetermined number of contexts.
 6. The method as claimed in claim 4 wherein the generating the context model by iteratively grouping contexts comprises applying a grouping function to the context set to combine the contexts into a smaller set of contexts.
 7. The method as claimed in claim 4 wherein the determining the initial context model comprises determining an initial context length dependent on the file categorization criteria, and wherein the initial context set is derived based on the initial context length.
 8. The method as claimed in claim 4 wherein the determining the initial context model comprises determining a size of an alphabet used in the file based on the determined file categorization criteria and then assigning an initial context length equal to a number of bits based on the number of bits needed to encode all elements of the alphabet; and wherein the initial context set is derived based on the initial context length.
 9. The method as claimed in claim 6 wherein the applying a grouping function comprises: creating groupings of contexts from the initial context set; calculating a conditional entropy for each grouping of contexts; selecting a reduced number of groupings based on the calculated conditional entropy; and reducing a size of the initial context set by replacing elements which comprise the selected groupings with the groupings.
 10. The method as claimed in claim 6 wherein the applying a grouping function comprises: iteratively, until a size of the set of contexts equals the predetermined number of contexts, performing: creating groupings of contexts from the initial context set; calculating a conditional entropy for each grouping of contexts; selecting a grouping with a lowest conditional entropy; and reducing a size of the set of contexts by replacing elements which comprise the selected grouping with the grouping.
 11. The method as claimed in claim 10 wherein the selecting the grouping with the lowest conditional entropy comprises determining joint, conditional, and unconditional probabilities from a large set of data files having the same file categorization criteria as the determined file categorization.
 12. The method as claimed in claim 10 wherein the selecting the grouping with the lowest conditional entropy comprises determining the nth order empirical statistics from a large set of data files having the same file categorization criteria as the determined file categorization, where n is a prescribed parameter.
 13. The method as claimed in claim 10 wherein the context model is used in a context-dependent grammar based compression process for compressing a sequence by parsing the sequence, and wherein the context set is such that a next context is determined from the current context and a current parsed phrase.
 14. The method as claimed in claim 10 wherein the initial context model is a state machine context model and wherein the initial context model is used in a context-based YK compression process for compressing a sequence x = x₁x₂ . . . x_(m) by parsing the sequence, and wherein the context set is chosen such that a next context from the context model is determined from the current context and a current parsed phrase.
 15. The method as claimed in claim 10 wherein the initial context model is a state machine context model and wherein the context model is used in a context-based YK compression process for compressing a sequence x = x₁x₂ . . . x_(m) by parsing the sequence, and wherein the step of creating grouping only creates groups such that when the groups are combined, a next context from the reduced set of contexts can still be determined from the current context and a current parsed phrase.
 16. The method as claimed in claim 1 wherein the file categorization criteria include at least one of content type, language, and file structure.
 17. A compression system, comprising: a processor configured for generating a context model for context-based compression by determining empirical statistics for a file type of a file to be compressed; and generating the context model by iteratively grouping contexts of an initial context model in accordance with the empirical statistics, the context model having fewer contexts than the initial context model.
 18. The system of claim 17, wherein the processor is further configured to generate a mapping file, for mapping data elements to a set of contexts of size equal to a size of the context model.
 19. The system of claim 17, wherein the processor is configured to determine the empirical statistics by: determining file categorization criteria for the file type; determining the initial context model including an initial value for a current number of contexts and an initial context set based on the file categorization criteria; and determining, for the initial context model, the empirical statistics of contexts and symbols.
 20. The system as claimed in claim 17, wherein determining the empirical statistics includes determining joint, conditional, and unconditional probabilities.